(54) Arithmetic circuit and method

(57) A multiplier circuit has an encoder (205; 12)
and a partial product bit generating circuit (14). The
encoder (205; 12) receives a multiplier bit signal (bj) and
outputs a plurality of encode signals. The partial product
bit generating circuit (14) receives the encode signals
along with a multiplicand bit signal (ai, /ai) from each
digit place and generates a partial product bit for each
digit place. The partial product bit generating circuit (14)
has a first selection circuit (201, 203) which selects a
logically true signal from among the encode signals in
accordance with a value of the multiplicand bit signal.
The circuit can therefore be reduced in size by reducing
the number of necessary elements without sacrificing its
high-speed capability.

The circuit further comprises 4-2 compressors and 
sign-correction circuitry. 
Description

The present invention relates to an arithmetic circuit and method such as a multiplier circuit, an adder circuit, a par- 
tial product bit compression circuit and method, and a large-scale semiconductor integrated circuit having any such cir- 
cuit. 

In recent years, with rapid advances in manufacturing and design technologies of large-scale semiconductor
integrated circuits, exemplified by microprocessors and digital signal processors, the demand for high-speed, large-scale
arithmetic circuits has been increasing. In particular, for multiplier circuits, which require lengthy calculations and a
large number of circuits, high-speed circuits with a reduced number of elements are needed.

According to a first aspect of the present invention, there is provided an adder circuit which takes, for each digit 
place, four input signals and one intermediate carry-in signal, and generates one intermediate carry-out signal along 
with a sum signal and a carry signal for output, wherein an OR or NOR signal and an exclusive-OR signal of a first input 
signal and a second input signal from the same digit place are formed, and when the exclusive-OR signal is a first value, 
a third input signal from the same digit place is output as the intermediate carry-out signal, while when the exclusive- 
OR signal is a second value, the OR or NOR signal is output as the intermediate carry-out signal. 

According to a second aspect of the present invention, there is provided an adder circuit which takes, for each digit
place, four input signals and one intermediate carry-in signal, and generates one intermediate carry-out signal along
with a sum signal and a carry signal for output, wherein an AND or NAND signal and an exclusive-OR signal of a first
input signal and a second input signal from the same digit place are formed, and when the exclusive-OR signal is a first
value, a third input signal from the same digit place is output as the intermediate carry-out signal, while when the
exclusive-OR signal is a second value, the AND or NAND signal is output as the intermediate carry-out signal.

A circuit for generating the sum signal by exclusive-ORing the five input signals may comprise a first exclusive-OR
circuit, constructed from a single-transfer-gate circuit, for exclusive-ORing the exclusive-OR signal of the first and
second input signals with the exclusive-OR signal of the third and fourth signals from the same digit place; and a plurality
of second exclusive-OR circuits, each constructed from a drive gate circuit or a complementary-transfer-gate circuit, for
exclusive-ORing the other signals. The first exclusive-OR circuit may comprise six transistors.
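
The carry-out selection described in the first and second aspects can be illustrated with a small bit-level model. The following is a minimal sketch, not the circuit of the invention: the function name `compressor_4_2`, the `use_or` flag, and the way the carry signal is formed are assumptions added for illustration; only the selection of the intermediate carry-out (the third input when the exclusive-OR of the first two inputs is 1, otherwise their OR or AND) and the five-input exclusive-OR for the sum are taken from the text.

```python
from itertools import product


def compressor_4_2(x1, x2, x3, x4, cin, use_or=True):
    """Bit-level model of one digit place of a 4-2 compressor (all values 0 or 1)."""
    x12 = x1 ^ x2                                # exclusive-OR of the first two inputs
    same = (x1 | x2) if use_or else (x1 & x2)    # OR (first aspect) or AND (second aspect)
    # Intermediate carry-out: the third input when x1 != x2, otherwise the OR/AND signal.
    # Note that it does not depend on cin, so carries do not ripple along the row.
    cout = x3 if x12 else same
    # Assumption (not stated in the text): the carry output is formed by a similar selection.
    x1234 = x12 ^ (x3 ^ x4)
    s = x1234 ^ cin                              # sum: exclusive-OR of all five inputs
    carry = cin if x1234 else x4
    return s, carry, cout


# Exhaustive check that the cell preserves the arithmetic value of its inputs.
for use_or in (True, False):
    for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(x1, x2, x3, x4, cin, use_or)
        assert x1 + x2 + x3 + x4 + cin == s + 2 * (carry + cout)
```

When the first two inputs are equal, their OR and their AND coincide, which is why either signal can serve as the second selection input.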

According to a third aspect of the present invention, there is provided an adder circuit which takes, for each digit
place, four input signals and one intermediate carry-in signal, and generates one intermediate carry-out signal along
with a sum signal and a carry signal for output, wherein a circuit for generating the sum signal by exclusive-ORing the
five input signals comprises a first exclusive-OR circuit, constructed from a single-transfer-gate circuit, for
exclusive-ORing the exclusive-OR signal of the first and second input signals with the exclusive-OR signal of the third and fourth
signals from the same digit place; and a plurality of second exclusive-OR circuits, each constructed from a drive gate
circuit or a complementary-transfer-gate circuit, for exclusive-ORing the other signals.

The adder circuit may be included in a digital multiplier circuit. 

In preferred embodiments of the present invention it is thus possible to provide an adder circuit (4-2 compression
circuit; multiplier circuit), specifically a carry-save adder circuit with four inputs for each digit place (4-2 compression
circuit), that can be constructed with 50 or fewer elements, as compared with prior art implementations requiring more than
50 elements.

According to a fourth aspect of the present invention, there is provided a multiplier circuit comprising an encoder
for receiving a multiplier bit signal and for outputting a plurality of encode signals; and a partial product bit generating
circuit for receiving the encode signals along with a multiplicand bit signal from each digit place and for generating a
partial product bit for each digit place, the partial product bit generating circuit including a first selection circuit for
selecting a logically true signal from among the encode signals in accordance with a value of the multiplicand bit signal.

The multiplicand bit signal and its inverted signal may be supplied to the partial product bit generating circuit. The
encoder may be a Booth encoder. The encode signals to be selected by the first selection circuit may be signals
identifying whether a necessary signal as an encoded result is the multiplicand bit signal itself or its inverted signal.

The partial product bit generating circuit may be loaded with multiplicand bit signals from a plurality of digit places;
and the partial product bit generating circuit may further include a second selection circuit for selecting, from among the
plurality of signals selected based on each multiplicand bit signal, a signal that matches the result of encoding in
accordance with an encode signal different from the selected signals.

Each of the first and second selection circuits may be constructed from two AND circuits and one NOR circuit. Each
of the first and second selection circuits may be constructed from transfer gates. Each of the first and second selection
circuits may be constructed from two transfer gates.

The multiplicand bit signal lines for transferring complementary multiplicand bit signals corresponding to
multiplicand digits may be arranged in parallel to each other extending in a first direction in a two-dimensional plane, and sets
of encode signal lines corresponding to multiplier digits may be arranged extending in a second direction that intersects
the first direction, while the partial product bit generating circuit is repeatedly arranged in order to contain a plurality of
predetermined adjacent intersections of the multiplicand bit signal lines and the encode signal lines.



EP0 827 069 A2 



According to a fifth aspect of the present invention, there is provided a multiplier circuit utilizing a Booth algorithm, 
comprising a circuit which, instead of a partial product bit signal in accordance with the Booth algorithm, generates for 
each digit place a bit signal corresponding to a sum of a correction value for twos complement of a most significant par- 
tial product and a binary number represented by a bit in a sign digit of a least significant partial product and bits from a 
least significant digit of the most significant partial product to a digit one position lower than the sign digit of the least 
significant partial product. 

According to a sixth aspect of the present invention, there is provided a multiplier circuit utilizing a method for
avoiding sign extension by correction processing, comprising a circuit which performs addition of one for sign correction in a
digit place one position higher than a sign digit of each partial product, wherein an intermediate carry-out signal as a
summation output for a digit place containing the sign digit, or a carry signal itself, is added in a digit place two positions
higher, and a NOT signal thereof is added in a digit place one position higher.
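
One way to read the sixth aspect is as an arithmetic identity: a carry c out of the sign digit place k would ordinarily be added in place k+1, where the sign-correction 1 is also added, and 1 + c equals 2c + (1 - c). The sketch below only checks that identity; the function names are hypothetical, and the interpretation is an assumption rather than a description of the actual circuit.

```python
def separate_correction(c, k):
    """Weight contributed when the correction 1 and the carry c are both added at place k+1."""
    return (1 << (k + 1)) + (c << (k + 1))


def merged_correction(c, k):
    """Weight contributed when c is added at place k+2 and NOT c at place k+1."""
    not_c = 1 - c
    return (c << (k + 2)) + (not_c << (k + 1))


# The two forms agree for both carry values and any sign digit position.
for k in range(16):
    for c in (0, 1):
        assert separate_correction(c, k) == merged_correction(c, k)
```

Under this reading, the merged form places at most one bit in each of the two higher digit places, instead of two bits in place k+1.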

According to a seventh aspect of the present invention, there is provided a multiplier circuit utilizing a Booth algo- 
rithm and also utilizing a method for avoiding sign extension by correction processing, comprising a circuit which, 
instead of a partial product bit signal in accordance with the Booth algorithm, generates for each digit place a bit signal 
corresponding to a sum of a correction value for twos complement of a most significant partial product and a binary 
number represented by a bit in a sign digit of a least significant partial product and bits from a least significant digit of 
the most significant partial product to a digit one position lower than the sign digit of the least significant partial product; 
and a circuit which performs addition of a 1 for sign correction in a digit place one position higher than a sign digit of 
each partial product, wherein an intermediate carry-out signal as a summation output for a digit place containing the 
sign digit, or a carry signal itself, is added in a digit place two positions higher, and a NOT signal thereof is added in a 
digit place one position higher. 

The multiplier circuit may be integrated together with additional circuitry for implementing signal processing
functions, and may constitute a large-scale semiconductor integrated circuit.

In preferred embodiments of the present invention, it is thus possible to provide a multiplier circuit which is reduced
in size by reducing the number of necessary elements without sacrificing its high-speed capability. In one specific
embodiment a partial product bit generating circuit (multiplier circuit) is provided in which the number of necessary
elements is reduced by half. Further embodiments provide an encoder (Booth encoder) suitable for implementing a partial
product bit generating circuit.

According to an eighth aspect of the present invention, there is provided a partial product bit compression method
for a multiplier circuit utilizing a Booth algorithm, wherein

instead of a partial product bit signal in accordance with the Booth algorithm, a bit signal corresponding to a sum
of a correction value for twos complement of a most significant partial product and a binary number represented by 
a bit in a sign digit of a least significant partial product and bits from a least significant digit of said most significant 
partial product to a digit one position lower than the sign digit of said least significant partial product, is generated 
directly for each digit place. 

According to a ninth aspect of the present invention, there is provided a partial product bit compression method for
a multiplier circuit utilizing a method for avoiding sign extension by correction processing, wherein

addition of one for sign correction is performed in a digit place one position higher than a sign digit of each partial 
product, and an intermediate carry-out signal as a summation output for a digit place containing said sign digit, or 
a carry signal itself, is added in a digit place two positions higher, while a NOT signal thereof is added in a digit 
place one position higher. 

In preferred embodiments of the present invention it is thus possible to provide a partial product bit compression
method for reducing the number of partial product bits for each digit place, that can shorten the critical path in partial bit
compression processing without increasing the number of necessary elements compared with the prior art.

Further aspects of the invention are exemplified by the attached claims. 

For a better understanding of the invention, and to show how the same may be carried into effect, reference will 
now be made, by way of example, to the accompanying drawings, in which:- 

Figure 1 is a block diagram schematically showing an example of a prior art multiplier circuit;

Figure 2 is a circuit diagram showing an example of a partial product bit generating circuit in a prior art multiplier
circuit;

Figure 3 is a circuit diagram showing another example of a partial product bit generating circuit in a prior art
multiplier circuit;

Figure 4 is a circuit diagram showing still another example of a partial product bit generating circuit in a prior art
multiplier circuit;

Figure 5 is a circuit diagram showing an example of a 4-2 compression circuit constituting a Wallace tree circuit in
a prior art multiplier circuit;

Figure 6 is a circuit diagram showing a 10-transistor EOR circuit used in a conventional implementation;

Figure 7 is a circuit diagram showing a sum signal generating circuit of a full-adder circuit constructed from two
6-transistor EOR circuits used in a conventional implementation;

Figures 8A to 8C are diagrams (part 1) for explaining the delay of operation caused when a transfer gate is used;

Figures 9A to 9C are diagrams (part 2) for explaining the delay of operation caused when a transfer gate is used;

Figure 10 is a circuit diagram showing another example of a 4-2 compression circuit constituting the Wallace tree
circuit in a prior art multiplier circuit;

Figure 11 is a block diagram showing the configuration of a partial product bit generating circuit for a multiplier
circuit according to an embodiment of the present invention;

Figure 12 is a circuit diagram showing a selection circuit usable in the partial product bit generating circuit of Figure
11;

Figure 13 is a circuit diagram showing a partial product bit generating circuit for a multiplier circuit according to
another embodiment of the present invention;

Figure 14 is a circuit diagram showing a Booth encoder for a multiplier circuit according to still further embodiments
of the present invention;

Figure 15 is a diagram showing the layout of a partial product bit generating circuit for a multiplier circuit embodying
the present invention;

Figure 16 is a circuit diagram showing an embodiment of a 4-2 compression circuit in the form of a Wallace tree
circuit for a multiplier circuit according to yet further embodiments of the present invention;

Figure 17 is a circuit diagram showing another embodiment of a 4-2 compression circuit in the form of a Wallace
tree circuit;

Figure 18 is a circuit diagram showing still another embodiment of a 4-2 compression circuit in the form of a Wallace
tree circuit;

Figure 19 is a diagram for explaining a partial product bit compression method for a multiplier circuit embodying the
present invention;

Figure 20 shows a first part of a circuit of an embodiment of the invention for implementing the partial product bit
compression method of Figure 19;

Figure 21 shows a second part of a circuit of an embodiment of the invention for implementing the partial product
bit compression method of Figure 19; and

Figures 22A and 22B are diagrams for explaining a partial product bit compression method for a multiplier circuit
according to another embodiment of the present invention.

Before proceeding to a detailed description of the preferred embodiments of the present invention, aspects of the
prior art are first described with reference to Figures 1 to 3.

It is well known to use a Booth algorithm in conjunction with a Wallace tree to implement a high-speed multiplier
circuit, and Figure 1 is a block diagram schematically showing an example of such a multiplier circuit.

In Figure 1, reference numeral 11 is a multiplier register, 12 is a Booth encoder, 13 is a multiplicand register, 14 is
a partial product bit generating circuit, 15 is a Wallace tree circuit, 16 is a carry-propagate adder circuit, and 17 is a
product register.

As shown in Figure 1, each multiplier bit output from the multiplier register 11 is input as an encode signal to the
partial product bit generating circuit 14 via the Booth encoder 12. On the other hand, a multiplicand bit signal from the
multiplicand register 13 is input directly to the partial product bit generating circuit 14.

The partial product bit generating circuit 14 generates partial product bits for each digit place from sets of
multiplicand bit signals and encode signals. The resulting partial product bit signals, after appropriate shifting, are input to a
multi-stage carry-save adder circuit (4-2 compression circuit, 1-bit adder circuit, etc.) constituting the Wallace tree circuit 15,
for reduction of bits in each digit column, and are recursively added until the number of summand bits becomes 2 for
the same column. When the number of bits is reduced to 2 for each column, the resulting signals are input to the
carry-propagate adder circuit 16 where ordinary two-input addition is performed, and a product signal for each digit place is
generated and is loaded into the product register 17.
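
The reduce-then-add flow described above can be sketched at word level. This is an illustrative model only: it uses plain 3-to-2 carry-save steps on whole integers rather than the 4-2 compression circuits of the Wallace tree, it forms unsigned AND-type partial products instead of Booth-encoded ones, and the function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """3-to-2 carry-save step on non-negative words: a + b + c == sum_word + carry_word."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (b & c) | (a & c)) << 1   # majority bits, shifted one place up
    return sum_word, carry_word


def reduce_to_two(operands):
    """Repeat carry-save steps until at most two summands remain."""
    ops = list(operands)
    while len(ops) > 2:
        nxt = []
        for i in range(0, len(ops) - 2, 3):           # compress each full group of three
            nxt.extend(carry_save_add(ops[i], ops[i + 1], ops[i + 2]))
        nxt.extend(ops[len(ops) - len(ops) % 3:])     # pass leftover summands through
        ops = nxt
    return ops


def multiply_unsigned(a, b, n_bits):
    """Shifted partial products, carry-save reduction, then one ordinary addition."""
    partial_products = [a << j for j in range(n_bits) if (b >> j) & 1]
    return sum(reduce_to_two(partial_products))       # final carry-propagate addition


assert multiply_unsigned(13, 11, 4) == 143
```

The carry-save steps never propagate a carry across the word, so only the last addition needs a carry-propagate adder.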

In the multiplier circuit configuration shown in Figure 1, the Booth encoder (the encoder), the partial product bit
generating circuit, and the Wallace tree circuit must be considered separately.

First, the encoder and the partial product bit generating circuit will be described. Both of these circuits are related
to each other and, therefore, need to be considered together. A specific example of an encoder based on a (modified)
second-order Booth algorithm is disclosed, for example, in Japanese Unexamined Patent Publication (Kokai) No.
55-105732.
Figure 2 is a circuit diagram showing one example of the partial product bit generating circuit 14 in the prior art
multiplier circuit; this example is disclosed in the above Japanese Unexamined Patent Publication No. 55-105732. Table 1
below is a truth table for explaining the second-order Booth encoding method.
Here, ai is the i-th digit of an m-bit multiplicand (the 0th digit is the least significant digit, and the (m-1)th digit is the
most significant digit indicating the sign, as in 2's (two's) complement notation), and bj is the j-th digit of an n-bit multiplier
represented in the same notation as the multiplicand. Further, Xj, 2Xj, and Mj are a set of second-order Booth encoder
output signals for the j-th digit of the multiplier, and indicate that one times the multiplicand, two times the multiplicand,
and the complement of the multiplicand, respectively, are output as the partial product bit signals for that digit place. As
shown in Table 1, from the values (levels) of bj-1, bj, and bj+1 of the multiplier, Xj, 2Xj, and Mj (PLj) of the
corresponding levels are generated by the encoding operation in the Booth encoder 12.
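
The encoding from three multiplier bits to the signals Xj, 2Xj, and Mj can be sketched as follows. This is the textbook form of second-order Booth encoding and is offered as an assumption: Table 1 itself is not reproduced here, so details such as the value of Mj for the all-ones input may differ from the table. All function names are hypothetical.

```python
def booth_encode(b_hi, b_mid, b_lo):
    """Encode the bit triple (b[j+1], b[j], b[j-1]) into (Xj, 2Xj, Mj)."""
    x = b_mid ^ b_lo                               # select one times the multiplicand
    two_x = int(b_hi != b_mid and b_mid == b_lo)   # select two times the multiplicand
    m = b_hi                                       # complement (assumed 1 also for 1,1,1)
    return x, two_x, m


def booth_digit(b_hi, b_mid, b_lo):
    """Signed digit in {-2, -1, 0, 1, 2} represented by the bit triple."""
    return -2 * b_hi + b_mid + b_lo


def booth_digits(value, n):
    """Booth digits of an n-bit two's complement multiplier (n even), lowest digit first."""
    bits = [(value >> k) & 1 for k in range(n)]
    digits = []
    for j in range(0, n, 2):
        b_lo = bits[j - 1] if j > 0 else 0         # b[-1] is taken as 0
        digits.append(booth_digit(bits[j + 1], bits[j], b_lo))
    return digits


# Each encoded triple selects the magnitude and sign of the corresponding digit.
for b_hi in (0, 1):
    for b_mid in (0, 1):
        for b_lo in (0, 1):
            x, two_x, m = booth_encode(b_hi, b_mid, b_lo)
            magnitude = x + 2 * two_x
            assert (-magnitude if m else magnitude) == booth_digit(b_hi, b_mid, b_lo)
```

Because each digit covers two multiplier bits, an n-bit multiplier yields n/2 digits, each weighted by a power of four.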

The partial product bit generating circuit 14 shown in Figure 2 generates a partial product bit signal Pi,j for the i-th
digit of the multiplicand and the j-th digit of the multiplier from the three signals (Xj, 2Xj, and Mj) and the multiplicand bits
(ai and ai-1). With the partial product bit generating circuit 14 of Figure 2, since a set of encoder output signals is
generated for two multiplier bits as the result of Booth encoding, the number of summand bits in each digit column is
reduced by half compared with direct summation; this achieves a faster operating speed and a reduction in the number
of circuit elements required.
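
A plausible reading of how the three encode signals and the two multiplicand bits combine is an AND-OR selection followed by a conditional inversion. The sketch below models that reading and checks it against ordinary multiplication; the exact gate wiring of Figure 2 is not reproduced, and `booth_encode`, `pp_bit`, and `partial_product` are hypothetical names with an assumed textbook encoding.

```python
def booth_encode(b_hi, b_mid, b_lo):
    """Assumed encoding of (b[j+1], b[j], b[j-1]) into (Xj, 2Xj, Mj)."""
    x = b_mid ^ b_lo
    two_x = int(b_hi != b_mid and b_mid == b_lo)
    return x, two_x, b_hi


def pp_bit(a_i, a_im1, x, two_x, m):
    """One partial product bit: select ai or ai-1, then invert when Mj is 1."""
    return ((a_i & x) | (a_im1 & two_x)) ^ m


def partial_product(a, width, x, two_x, m):
    """Value of one (width+1)-bit partial product row plus the +Mj correction."""
    def bit(i):
        if i < 0:
            return 0                                 # nothing shifts in below digit 0
        return (a >> min(i, width - 1)) & 1          # sign-extend above the top digit
    raw = sum(pp_bit(bit(i), bit(i - 1), x, two_x, m) << i for i in range(width + 1))
    signed = raw - (1 << (width + 1)) if (raw >> width) & 1 else raw
    return signed + m                                # +1 completes the two's complement


# For every 4-bit multiplicand and every bit triple, the row equals digit * multiplicand.
for a in range(-8, 8):
    for b_hi in (0, 1):
        for b_mid in (0, 1):
            for b_lo in (0, 1):
                x, two_x, m = booth_encode(b_hi, b_mid, b_lo)
                digit = -2 * b_hi + b_mid + b_lo
                assert partial_product(a, 4, x, two_x, m) == digit * a
```

The inversion alone gives the one's complement; the extra 1 added when Mj is 1 is the correction value for the two's complement.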

As shown in Figure 2, this partial product bit generating circuit consists of two AND circuits 101 and 102, a NOR
circuit 103, and an ENOR circuit 104, and requires 18 transistors (elements) when the circuit is implemented using
the CMOS (complementary metal-oxide semiconductor) technology most commonly used for the fabrication of large-scale
semiconductor integrated circuits.

Figure 3 is a circuit diagram showing another example of the partial product bit generating circuit 14 in the prior art
multiplier circuit; this example is disclosed in Japanese Unexamined Patent Publication No. 4-227534.

In the case of the partial product bit generating circuit 14 shown in Figure 3, the Booth encoder 12 outputs five sets
of signals, SX2 (for +2), SX2* (for -2), SX1 (for +1), SX1* (for -1), and SO (for 0), and one of the multiplicand bit
signals ai, ai-1, /ai, and /ai-1, or a signal fixed to 1 (high level "H"), is selected and is output via an inverter. The partial
product bit generating circuit of Figure 3 consists of a transistor (P-channel MOS transistor) 105, four transfer gates 106
to 109, and five inverters 110 to 114; the number of necessary elements shown in the figure is 19.

Furthermore, in the partial product bit generating circuit of Figure 3, signal lines for ai and ai-1 (multiplicand bit
signals) are connected directly to the input terminals of the corresponding transfer gates. In this case, however, unless the
driving capability of the gates that output ai and ai-1 is sufficiently large, the time constant becomes very long. This time
constant is determined by the series resistance of the transfer and drive gates and by a capacitance that is the sum of the
gate capacitance of each inverter within the circuit, the capacitance of the line connected to each input terminal, and the
source-drain capacitance of each transfer gate. The result is an increase in power consumption because of signal waveform
degradation and an increase in delay time because of longer signal rise and fall times.

Accordingly, the same ai signal line can be connected, at most, to two partial product bit generating circuits; if the
signal is to be supplied to three or more generating circuits, a separate buffer circuit will become necessary.
Furthermore, since inverters are needed for inverting ai and ai-1, and these additional circuits require 4 elements per bit, a total
of 23 elements (23 transistors) are required. If an inverted signal of each of the SX2, SX2*, SX1, and SX1* signals is
prepared in advance and is input to the partial product bit generating circuit, the inverters for controlling the transfer
gates become unnecessary and the number of elements required can be reduced to 15. In that case, however, the
number of signal lines as a set increases to 9, which is not desirable since this poses a connection problem in LSI
implementation.

Figure 4 is a circuit diagram showing still another example of the partial product bit generating circuit 14 in the prior
art multiplier circuit; this example is disclosed in Japanese Unexamined Patent Publication No. 5-88852. The partial
product bit generating circuit 14 shown in Figure 4 consists of a transistor 115, two transfer gates 116 and 117, three
inverters 118 to 120, and an ENOR circuit 121, and requires a total of 21 elements.

Next, the prior art configuration of the Wallace tree circuit 15 will be described. In the Wallace tree circuit 15, the
partial product bits for each digit place (totalling (n/2 + 1) bits at maximum), generated via the Booth encoder 12, are
repeatedly compressed using a carry-save adder circuit until the number of bits for each digit place finally becomes 2.
As the carry-save adder circuit used here, a 1-bit full-adder circuit or a 4-2 compression circuit is employed. When the
number of bits in each digit place is large, the latter circuit can often perform processing at higher speed.

Figure 5 is a circuit diagram showing one example of the 4-2 compression circuit constituting the Wallace tree
circuit 15 in the prior art multiplier circuit; this example is disclosed in Japanese Unexamined Patent Publication No.
2-112020. The 4-2 compression circuit shown in Figure 5 consists of an inverter 122, two AND circuits 123 and 124, three
OR circuits 125 to 127, three NAND circuits 128 to 130, one NOR circuit, and four EOR circuits 132 to 135. That is, the
4-2 compression circuit exemplified by the configuration shown in Figure 5 requires the use of four EOR circuits 132 to
135 for exclusive-ORing (EORing) five inputs, including an intermediate carry-in signal Cin, in order to generate a sum
signal Si,j, which is one of the output signals.

Figure 6 is a circuit diagram showing a 10-transistor EOR circuit (complementary transfer gate EOR circuit) in a
conventional implementation, which is used to generate an exclusive-OR signal (EOR signal) of two input signals x1
and x2. Further, Figure 7 is a circuit diagram showing a sum signal generating circuit of a full-adder circuit constructed
from 6-transistor EOR circuits (single-transfer-gate EOR circuits) EOR1 and EOR2, used in a conventional
implementation, to generate an EOR signal of three input signals x1, x2, and x3.

As shown in Figure 6, using CMOS circuit technology commonly employed for the fabrication of LSI circuits, the
complementary transfer gate EOR circuit usually requires 10 transistors for three inverters 136 to 138 and two transfer
gates 139 and 140.

Likewise, as shown in Figure 7, one EOR circuit (EOR1) can be constructed using transistors 141 and 142, a
transfer gate 143, and an inverter 144. That is, the EOR circuit can be constructed with six transistors (elements). This
6-transistor single-transfer-gate EOR circuit is cited in many documents, for example,
Japanese Unexamined Patent Publication Nos. 58-211252, 59-211138, S1-262928, and 4-227534.

However, the processing speed of the above 6-transistor EOR circuit is much slower than that of the 10-transistor
EOR circuit shown in Figure 6, and in particular, when two EOR circuits are directly coupled together, as shown in
Figure 7, the processing speed becomes intolerably slow. The reason is as follows.

In Figure 7, suppose that now X1 = "1", X2 = "1", and X3 = "1" and at the next time instant these input conditions
change to X1 = "0", X2 = "0", and X3 = "r. This causes the N-channel transistor 142 (T1) to change from the ON to the
OFF state and the P-channel transistors 141 (T2) and 140 (T3) from the OFF to the ON state. As a result, current which
was flowing via node N2 now flows via node N3, but since the nodes N2 and N3 are electrically isolated (shut off) from
each other, the change at the node N2 does not affect the node N3, nor does the change at the node N3 affect the node
N2. Furthermore, the potential at node N1 changes only slowly because it does not change until the current begins to
flow through the series circuit of the high-resistance P-channel transistors T2 and T3.

On the other hand, the switch circuit (transfer gates 139 and 140) shown in Figure 6 is constructed with P-channel
and N-channel transistors connected in parallel, so that when one transistor is turned off, the other transistor is turned
on, the state changes of both transistors affecting each other to accelerate the change, thus achieving high-speed
switching. As described above, the 6-transistor EOR circuit uses fewer elements but sacrifices the operating speed, and
is therefore not suitable for circuit applications requiring high operating speed. On the other hand, if the Wallace tree
circuit shown in Figure 5 is constructed using the 10-transistor EOR circuits, a total of 66 elements (66 transistors) will
be required.

Figures 8A to 8C and Figures 9A to 9C are diagrams for explaining how the delay of operation is caused when a 
transfer gate is used. 

First, consider the case shown in Figure 8A where signal S from the drive gate DG is supplied to the transfer gate
TG and a plurality of loads LG (for example, n inverters) are driven by the output of the transfer gate TG. Then the
equivalent circuit will be as shown in Figure 8B. As shown, the equivalent circuit is represented by the voltage source E and
resistance Rd of the drive gate DG, the resistance Rt (ON resistance) of the transfer gate TG, the capacitances Cs and
Cd of the source and drain of the transistor constituting the transfer gate TG, and the input capacitances Cg of the n
loads LG.

To simplify the explanation, if the equivalent circuit of Figure 8B is rewritten as shown in Figure 8C, delay time τ1 in
the rising and falling of the signal S is given by

τ1 ∝ (Rd + Rt) · (Cs + Cd + n·Cg) ≈ 3/2 · (n + 2) · Cg · Rd

Here, it is assumed that Rt ≈ 1/2 · Rd and Cs ≈ Cd ≈ Cg. In the case of a miniaturized semiconductor device, for
example, a CMOS device with a line width of about 0.35 µm, Cs ≈ Cd ≈ Cg substantially holds.

In this way, when driving the loads LG by the transfer gate TG, the transfer gate TG must drive each load LG with
its input capacitance multiplied by 1.5 as shown above, as a result of which the delay time τ1 increases (driving speed
decreases). Furthermore, considering the effects of +2 in the term (n + 2) in the above equation, the increase in the
delay time τ1 becomes larger than when the delay time is simply proportional to the number n.

On the other hand, in the case shown in Figure 9A where the signal S from the drive gate DG is supplied directly
to drive a plurality of loads LG (for example, n' inverters), the equivalent circuit will be as shown in Figure 9B. The
equivalent circuit of Figure 9B can be further simplified as shown in Figure 9C. Here, delay time τ2 in the rising and falling of
the signal S is given by

τ2 ∝ n' · Rd · Cg

When driving the n' loads LG by the drive gate DG shown in Figures 9A to 9C, the drive gate DG need only drive
the input capacitance Cg of each load LG multiplied by the number n', achieving a higher driving speed than when the
n loads LG are driven by the transfer gate TG shown in Figures 8A to 8C.
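
The two delay expressions can be compared numerically. The sketch below evaluates them in normalized units (Rd = Cg = 1) under the stated assumptions that the transfer gate resistance is half the drive gate resistance and that the source, drain, and gate capacitances are equal; since the expressions are proportionalities, only the ratio is meaningful. The function names are hypothetical.

```python
def tau_transfer_gate(n, rd=1.0, cg=1.0):
    """Relative delay when n loads are driven through a transfer gate (Figures 8A to 8C)."""
    rt = 0.5 * rd          # assumed ON resistance of the transfer gate
    cs = cd = cg           # assumed equal source, drain, and gate capacitances
    return (rd + rt) * (cs + cd + n * cg)


def tau_drive_gate(n, rd=1.0, cg=1.0):
    """Relative delay when n loads are driven directly by the drive gate (Figures 9A to 9C)."""
    return n * rd * cg


for n in (1, 2, 4, 8):
    ratio = tau_transfer_gate(n) / tau_drive_gate(n)
    print(f"n = {n}: transfer gate {tau_transfer_gate(n):.1f}, "
          f"drive gate {tau_drive_gate(n):.1f}, ratio {ratio:.2f}")
```

For the same number of loads the ratio is 1.5 · (n + 2) / n, so the penalty of the transfer gate is largest for small fan-out and approaches a factor of 1.5 as n grows.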

Figure 10 is a circuit diagram showing another example of the 4-2 compression circuit constituting the Wallace tree
circuit in the prior art multiplier circuit; this example shows the configuration of the 4-2 compression circuit described in
the paper (pp. 251-257) carried in the March 1995 issue of "IEEE Journal of Solid-State Circuits", a magazine issued
by The Institute of Electrical and Electronics Engineers, Inc. (IEEE), U.S.A. As shown in Figure 10, the 4-2 compression
circuit (Dual Pass Transistor Logic (DPL) 4-2 Compressor) consists of eight inverters 145 to 152 and nine
complementary transfer gates 153 to 161, using a total of 52 elements (52 transistors). However, in the 4-2 compression circuit of
Figure 10 also, since the series resistance of transfer gates is directly connected to each signal input terminal, if the
input capacitance is large, for example, when a long connecting line is connected to the input terminal, or if there exists
an additional input resistance, the signal rise and fall times increase and the processing speed decreases.

In the Wallace tree circuit, if the number of bits to be compressed for each digit place can be reduced, the number
of compression circuits required can be reduced, as a result of which the size of the multiplier circuit can be reduced.
In the prior art, the reduction of the number of bits for each digit place has been addressed primarily from the standpoint
of compressing the sign bits of partial products. Specific examples are disclosed, for example, in Japanese
Unexamined Patent Publication Nos. 57-121736 and 59-3634.

Further, Japanese Unexamined Patent Publication No. 4-287220 discloses a connection means which enables the
correct calculation result to be obtained without extending the sign of each partial product to a higher digit place when
adding together partial products (multiples) in 2's complement representation, and carries a description of a multiplier
circuit in which adders for adding sets of partial products are modified so that, without requiring an independent
correction circuit for performing operations on the signs of the partial products and adding correction values, calculation
results comparable to the correction circuit can be obtained using the connection means. However, with the method
disclosed in Japanese Unexamined Patent Publication No. 4-287220, though the circuitry relating to the sign digit can be
simplified, the critical path in the Wallace tree circuit cannot be shortened because the maximum number (n/2 + 1) of
partial product bits for each digit place cannot be changed when using the second-order Booth algorithm.

Further, in General Lecture C-541 (A Collection of Lecture Papers, Electronics 2, p. 157), 1996 General Convention
of The Institute of Electronics, Information and Communication Engineers, a method is proposed in which the 2's
complement is obtained in advance for the low-order 5 bits of the multiplicand, and when the result of Booth encoding is
negative, the 2's complement bits thus obtained are output instead of the usual partial product bits, thereby reducing the
maximum number of partial product bits for each digit place to n/2 bits and thus achieving faster operation. While this
method can reduce the maximum number of partial product bits for each digit place and thereby shorten the critical path
in the Wallace tree circuit, a delay longer than the delay introduced by the Booth encoder occurs when generating the
complement of 5 bits, and further, a circuit for generating the complement of 5 bits is required; therefore, the effect of
increasing the speed by reducing the maximum number of partial products is offset by these disadvantages, and rather,
the number of elements in the multiplier circuit as a whole may increase in some cases.

Referring again to the encoder and the partial product bit generating circuit, the prior art partial product bit
generating circuit 14 requires about 20 elements per bit, and while the number of bits to be added together for each digit
place is reduced by half by Booth encoding, the reduction in the total number of circuit elements is not as large as expected
since a large number of elements are required for generation of the partial product bits. In particular, when the number,
n, of multiplier bits is small, the total number of elements may increase.

As described, various efforts have been made in the prior art to increase the speed of multiplier circuits and to
reduce the number of necessary elements, but there remains a need for still faster multiplier circuits with fewer
elements.

Embodiments of the invention are now described with reference to Figures 11 to 21 of the accompanying
drawings.

First, a partial product bit generating circuit (multiplier circuit) employing a configuration that can reduce the number 
of necessary elements by half, according to a first embodiment of the invention, and an embodiment of an encoder 
(Booth encoder) suitable for implementing the partial product bit generating circuit will be described. 

Figure 11 is a block diagram showing the configuration of the partial product bit generating circuit 14 in the multiplier
circuit. In the figure, reference numeral 201 is a first selection circuit for a multiplicand bit ai in the i-th digit place; 202 is
a second selection circuit for the multiplicand bit ai; 203 is a first selection circuit for a multiplicand bit ai+1 in the (i+1)th
digit place; and 204 is a second selection circuit for the multiplicand bit ai+1. Further, reference numeral 205 indicates
an encoder which generates encode signals B1j, B2j, B3j, and /B3j by using a multiplier bit bj in the j-th digit place and
the multiplier bits in its next higher and lower digit places.

The first selection circuit 201 for the multiplicand bit ai in the i-th digit place outputs the encode signal B1j when the
multiplicand bit ai is a 1, and outputs the encode signal B2j when the multiplicand bit ai is a 0. The other selection
circuits 202 to 204 have similar functions.

Now, suppose that the encoder (12) is a second-order Booth encoder, and that B1j = PLj (the result of encoding is
positive), B2j = Mj (the result of encoding is negative), B3j = Xj (the result of encoding is one times the multiplicand),
and /B3j = /Xj, as shown in the previously described Table 1 as a truth table. Then, the second selection circuit 202 for
the multiplicand bit ai in the i-th digit place outputs a partial product bit Pi,j for the multiplicand bit ai and the multiplier
bit bj, while the second selection circuit 204 for the multiplicand bit ai+1 in the (i+1)th digit place outputs a partial product
bit Pi+1,j for the multiplicand bit ai+1 and the multiplier bit bj.

That is, the partial product bit generating circuit of Figure 11 generates partial product bits for two successive bits
(ai, ai+1) of the multiplicand. Here, reference sign Ei-1,j indicates the signal selected and output from a first selection
circuit (203) for a multiplicand bit ai-1 in the next lower digit place. The signal Ei+1,j selected and output from the circuit
of Figure 11 is therefore used to generate a partial product bit Pi+2,j for the next higher digit place. As will be described
later, according to the first embodiment of the invention, the circuit configuration shown in Figure 11 allows the selection
circuits (201 to 204) to be constructed using fewer elements than EOR circuits, as a result of which partial product bits
can be generated using fewer elements than the prior art and without reducing the processing speed.

Figure 12 is a circuit diagram showing one example of the configuration for the selection circuits (201 to 204) usable
for a partial product bit generating circuit of a multiplier circuit.

As shown in Figure 12, the selection circuit configuration (for the first selection circuits 201, 203 and the second
selection circuits 202, 204) consists of two AND circuits 301, 302 and a NOR circuit 303. Here, the signal output from
the first selection circuit 201, 203 is inverted with respect to the selected signal, but this inverted signal is again inverted
by the subsequent second selection circuit 202, 204, whose output is therefore the non-inverted signal of the selected
signal. In the present embodiment, the partial product bit generating circuit (for two bits) requires 8 x 4 = 32 elements
(transistors), that is, 16 elements per bit. It is thus shown that the partial product bit generating circuit (with 16 elements
per bit) of the present embodiment can be constructed with fewer elements than any of the partial product bit generating
circuits (14) of the prior art shown in Figures 2 (18 elements per bit), 3 (19 elements per bit), and 4 (21 elements per bit).

Figure 13 is a circuit diagram showing another embodiment of a partial product bit generating circuit 14 for a
multiplier circuit. This embodiment is constructed to further reduce the number of elements compared with the partial
product bit generating circuit constructed with the selection circuits shown in Figure 12.

As shown in Figure 13, the partial product bit generating circuit (for two bits) of this embodiment is constructed with
eight transfer gates 304 to 311 and two inverters 312 and 313. In comparison with the partial product bit generating
circuit 14 of Figure 11, the first selection circuit 201 and second selection circuit 202 for the multiplicand bit ai in the i-th
digit place are constructed from the transfer gates 306, 307 and transfer gates 310, 311, respectively, while the first
selection circuit 203 and second selection circuit 204 for the multiplicand bit ai+1 in the (i+1)th digit place are
constructed from the transfer gates 304, 305 and transfer gates 308, 309, respectively.

The output signal Pi+1,j from the second selection circuit 204 for the multiplicand bit ai+1 and the output signal Pi,j
from the second selection circuit 202 for the multiplicand bit ai can be written by logic expressions as shown below, from
which it can be seen that two successive partial product bits are generated.

Pi+1,j = (PLj · ai+1 + Mj · /ai+1) · Xj + (PLj · ai + Mj · /ai) · /Xj

Pi,j = (PLj · ai + Mj · /ai) · Xj + (PLj · ai-1 + Mj · /ai-1) · /Xj
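The two logic expressions above can be read as a cascade of 2:1 selections: a first selection between PLj and Mj controlled by a multiplicand bit, then a second selection controlled by Xj. The following Python sketch models that cascade and checks it against ordinary signed multiplication. It is an illustration only and makes several assumptions that are not taken from this description: Table 1 is not reproduced here, so the encoding in `booth_signals` is the conventional second-order Booth rule (PL = 1 for a positive digit, M = 1 for a negative digit, X = 1 for a digit of magnitude one); the helper names, the sign extension of the multiplicand by one bit, and the addition of Mj as the 2's complement correction are likewise hypothetical modelling choices.

```python
def select(control, when_one, when_zero):
    """2:1 selection circuit: pass `when_one` if control is 1, else `when_zero`."""
    return when_one if control else when_zero


def booth_signals(b_hi, b_mid, b_lo):
    """Assumed second-order Booth encoding of the bit triple (b[j+1], b[j], b[j-1])."""
    digit = -2 * b_hi + b_mid + b_lo      # Booth digit in {-2, -1, 0, 1, 2}
    return int(digit > 0), int(digit < 0), int(abs(digit) == 1)   # PL, M, X


def partial_product_bit(a_i, a_im1, pl, m, x):
    """P(i,j) = (PL*a_i + M*/a_i)*X + (PL*a_(i-1) + M*/a_(i-1))*/X."""
    e_i = select(a_i, pl, m)        # first selection circuit, digit i
    e_im1 = select(a_im1, pl, m)    # first selection circuit, digit i-1
    return select(x, e_i, e_im1)    # second selection circuit


def booth_multiply(a, b, m_bits=8, n_bits=8):
    """Multiply signed integers by summing the selected partial product bits."""
    def a_bit(i):                   # a(-1) = 0; bits above the MSB repeat the sign
        return 0 if i < 0 else (a >> min(i, m_bits - 1)) & 1

    def b_bit(j):                   # b(-1) = 0
        return 0 if j < 0 else (b >> j) & 1

    total = 0
    for j in range(0, n_bits, 2):   # one Booth digit per two multiplier bits
        pl, m, x = booth_signals(b_bit(j + 1), b_bit(j), b_bit(j - 1))
        bits = [partial_product_bit(a_bit(i), a_bit(i - 1), pl, m, x)
                for i in range(m_bits + 1)]
        # Interpret the (m_bits + 1) bits as a signed number (top bit is the sign).
        value = sum(bit << i for i, bit in enumerate(bits)) - (bits[-1] << (m_bits + 1))
        total += (value + m) << j   # M supplies the +1 of the 2's complement
    return total
```

Under these assumptions, `booth_multiply(a, b)` agrees with `a * b` for every pair of signed operands that fits in the chosen widths (the multiplier width must be even), which is one way to confirm that the selection cascade produces the intended partial product bits.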

In the partial product bit generating circuit (for two bits) shown in Figure 13, the number of elements required is 20,
which means 10 elements (10 transistors) per bit, achieving a reduction by a factor of about 2 compared with the prior
art partial product bit generating circuits (18 to 21 elements per bit) shown in Figures 2 to 4.

Here, the inverters 312 and 313, to which Booth encode signals /PLj (complement of the signal output when the
result of encoding is positive) and /Mj (complement of the signal output when the result of encoding is negative) are
respectively input, are provided to avoid the problems that could arise if the encode signals were applied directly to the
input terminals of the transfer gates (the problems described in connection with Figure 3, such as increased power
consumption due to signal waveform degradation and increased delay time due to longer signal rise and fall times). These
inverters would be unnecessary if the logic gates for outputting the encode signals had sufficient driving capability. If
the inverters 312 and 313 are not necessary, the number of elements required to construct the partial product bit
generating circuit (for two bits) can be reduced below 20. There are further required inverters for creating the complements
of the multiplicand bits ai, ai+1, ai-1, etc., but since the multiplicand bit signals and their complements are applied to the
gate terminals of the transfer gates, inputs to 10 or more partial product bit generating circuits can be supplied from one
signal buffer, provided that the driving capability of the signal buffer is sufficiently large, and if the number, n, of bits in
the multiplier is 16 or over, for example, the addition of these inverters can be ignored as a factor contributing to
increasing the number of elements.

Figure 14 is a circuit diagram showing one example of a Booth encoder usable in a multiplier circuit. The figure
shows one example of the second-order Booth encode signal generating circuit (encoder) 12 for the circuit shown in
Figure 13. The encode signal symbols shown correspond to the symbols shown in Table 1.

As shown in Figure 14, the Booth encoder 12 comprises seven inverters 314 to 320, an AND circuit 321, an OR
circuit 322, a NAND circuit 323, a NOR circuit 324, and an ENOR circuit 325. Here, some of the inverters located at the
output side of the encode signals can be omitted in cases where an encode signal buffer with a low driving capability is
acceptable: for example, the pairs of successive inverters (315/316 and 317/318) located in the output paths of the
encode signals /Xj and /PLj can be omitted. The number of elements required to construct the Booth encoder 12 may
increase somewhat as compared with the prior art, but in the case of a multibit multiplier circuit in which the number, m,
of bits in the multiplicand is 16 or more, this increase has little effect because only one encoder is needed for every two
multiplier bits.

The above examples have been described assuming the use of a second-order Booth encoder, but similar circuits
can also be constructed when using a third- or higher-order Booth encoder; for example, in Figure 11, this can be
accomplished by constructing the second selection circuit as a 4:1 selection circuit, in which case one of four outputs
selected in accordance with the values of the multiplicand bits, ai, ai-1, a1 (the i-th digit of three times the multiplicand),
and ai-2, is selected for output in corresponding relationship to an encode output that has a positive value out of the
encode outputs Xj, 2Xj, 3Xj (three times the multiplicand is selected), and 4Xj (four times the multiplicand is selected).
Further, the method of encoding is not necessarily limited to Booth's method, but other encoding methods similar to the
one employed in the present embodiment may be applied directly for the circuit of Figure 11 or by partially modifying
the circuit.

Thus, according to the first embodiment of the invention, there are provided a partial product bit generating circuit
(multiplier circuit) of a configuration that can reduce the number of necessary elements by half, and an encoder (Booth
encoder) suitable for implementing the partial product bit generating circuit.

For a high-speed multiplier circuit suitable for implementation as an integrated (LSI) circuit, a layout method suitable
for implementation in integrated circuit form must be considered besides reducing the number of required elements
without sacrificing its high speed capability. This subject is briefly described in the present inventor's paper on a
multiplier carried in the September 1992 issue of "IEEE Journal of Solid-State Circuits," pp. 1229-1236. However, since the
present invention employs a circuit different from the one described in that paper, the layout must be optimized as well.
A method for accomplishing this will be described below.

Figure 15 is a diagram showing the layout of the partial product bit generating circuit for a multiplier circuit.
As shown in Figure 15, multiplicand bit signals (ai, ai+1) corresponding to particular digits in the multiplicand and
their inverted signals (/ai, /ai+1) are arranged parallel to each other extending in one direction (vertical direction) in a
two-dimensional plane, and sets of encode signal lines (Bj, Bj+2, ...) corresponding to particular digits in the multiplier
are arranged intersecting them (using horizontally extending wiring patterns on a different layer from the layer on which
the multiplicand bit signal lines are formed). In Figure 15, the complementary multiplicand bit signals ai and /ai, for
example, are represented by a single signal line extending vertically on the left side of a through wiring region.
Reference sign 4W in Figure 15 indicates a 4-2 compression circuit which will be described later.

Further, a partial product bit generating circuit (P2: 14) is repeated in such a manner as to contain a plurality of
predetermined adjacent intersections of the multiplicand bit signal lines and encode signal lines. By repeating the same
circuit cell in this way, the partial product generating circuits can be formed one adjacent to another, thus making it easy
to lay out the major elements of the multiplier circuit. When a third embodiment of the invention is applied, part of the
circuit is replaced by an irregular circuit, but since this affects only part of the partial product bit generating circuit and
does not disturb the ordered tree arrangement of the adder circuits, the layout as a whole remains substantially
unaffected.

Next, a carry-save adder circuit according to a second embodiment of the invention will be described, the circuit
having four inputs for each digit place (4-2 compression circuit) and being capable of reducing the number of circuit
elements necessary.






EP0 827069A2 



In a first configuration of the carry-save adder circuit of the second embodiment (4-2 compression circuit) which,
for each digit place, takes four input signals (x1, x2, x3, x4) and an intermediate carry-in signal (Cin) as inputs and
generates an intermediate carry-out signal (Cout), a sum signal (Si,j), and a carry signal (Ci,j) for output, an OR (or NOR)
and an exclusive-OR (EOR) of the first input signal x1 and second input signal x2 are formed and, when the exclusive-OR
output is true (logic 1), the third input signal x3 is output as the intermediate carry-out signal (Cout), while when the
EOR output is false (logic 0), the OR output signal is output as the intermediate carry-out signal (Cout). In a second
configuration of the 4-2 compression circuit, an AND (or NAND) and an EOR of the first input signal x1 and second input
signal x2 are formed and, when the EOR output is true (logic 1), the third input signal x3 is output as the intermediate
carry-out signal, while when the EOR output is false (logic 0), the AND output signal is output as the intermediate
carry-out signal.
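The selection rule for the intermediate carry-out in the two configurations can be checked exhaustively with a small behavioural model. The Python sketch below is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, the sum is modelled as the exclusive-OR of all five inputs as the text describes, and the carry output is modelled with the conventional 4-2 compressor rule (Cin when x1 ⊕ x2 ⊕ x3 ⊕ x4 is 1, otherwise x4), which is an assumption because this passage does not spell out the carry logic.

```python
from itertools import product


def compressor_4_2(x1, x2, x3, x4, cin, use_or=True):
    """Behavioural model of a 4-2 compression circuit for one digit place."""
    eor12 = x1 ^ x2
    # First configuration uses the OR of x1, x2; second configuration uses the AND.
    side = (x1 | x2) if use_or else (x1 & x2)
    cout = x3 if eor12 else side      # intermediate carry-out selection
    eor1234 = eor12 ^ (x3 ^ x4)
    s = eor1234 ^ cin                 # sum: exclusive-OR of all five inputs
    c = cin if eor1234 else x4        # assumed conventional carry selection
    return s, c, cout


def preserves_count(use_or):
    """True if s + 2*(c + cout) equals the number of 1 inputs in every case."""
    for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
        s, c, cout = compressor_4_2(x1, x2, x3, x4, cin, use_or)
        if x1 + x2 + x3 + x4 + cin != s + 2 * (c + cout):
            return False
    return True
```

Both `preserves_count(True)` and `preserves_count(False)` hold in this model because, when the EOR of x1 and x2 is 0, the two inputs are equal and their OR and AND coincide, so either signal can stand in for the majority of x1, x2, and x3.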

In each of the first and second configurations, of the circuits that exclusive-OR the five input signals to produce the
sum signal, only the circuit that exclusive-ORs the EOR output signal of the first and second input signals with the EOR
output signal of the third and fourth input signals is constructed from a 6-transistor circuit (a single-transfer-gate circuit),
and all the other EOR circuits are constructed from circuits of eight or more transistors.

In the circuit of the first configuration, the circuit that produces the OR (or NOR) can be used as part of the circuit
that exclusive-ORs the first and second inputs for each digit place; likewise, in the circuit of the second configuration,
the circuit that produces the AND (or NAND) can be used as part of the circuit that exclusive-ORs the first and second
inputs for each digit place. Furthermore, the number of component elements is reduced by selecting the third input
signal or the OR/AND signal for output as the intermediate carry-out signal. Moreover, by using the 6-transistor EOR
circuit, which is expected to present a problem in terms of speed, only in a section where the speed is least affected, the
total number of elements can be reduced without substantially reducing the speed. That is, according to the adder
circuit (4-2 compression circuit) of the present invention, the circuit which required more than 50 elements in the prior art
can be constructed with 50 or fewer elements without sacrificing its high speed capability.

Figure 16 is a circuit diagram showing an embodiment of a 4-2 compression circuit constituting a Wallace tree
circuit for a multiplier circuit.

As shown in Figure 16, the 4-2 compression circuit of this embodiment comprises 10 transfer gates 401 to 410, 10
inverters 411 to 420, an OR circuit 421, and two NAND circuits 422 and 423. Here, an EOR circuit 430 (consisting of
an ENOR circuit and an inverter and corresponding to the EOR circuit 133 in Figure 5) is constructed with the inverter
412, the OR circuit 421, and the NAND circuits 422 and 423; an EOR circuit 440 (corresponding to the EOR circuit 132
in Figure 5) is constructed with the transfer gates 403 and 404 and the inverters 414, 415, and 420; an EOR circuit 450
(corresponding to the EOR circuit 134 in Figure 5) is constructed with the transfer gates 405 and 406 and the inverters
412, 415, and 416; and an EOR circuit 460 (corresponding to the EOR circuit 135 in Figure 5) is constructed with the
transfer gates 407 and 408 and the inverters 416, 417, and 418.

As is apparent from Figure 16, the EOR circuits 440, 450, and 460 are each constructed from the 10-transistor
EOR circuit (complementary transfer gate circuit) previously shown in Figure 6, to maintain the high speed capability of
the circuit. Further, the inverter 412 is provided common to the EOR circuits 430 and 450, the inverter 415 common to
the EOR circuits 440 and 450, and the inverter 416 common to the EOR circuits 450 and 460, to reduce the number of
elements. The EOR circuit 430 is constructed from the ENOR circuit (421, 422, 423) and the inverter 412, using a total
of 12 transistors.

In the 4-2 compression circuit shown in Figure 16, a NAND signal of the first input signal x1 and second input signal
x2 and an exclusive-OR signal (EOR signal) of x1 and x2 are produced, and when the EOR output (the output of the
inverter 412) is true (logic 1), the third input signal x3 is output as the intermediate carry-out signal (Cout), while when
the EOR output is false (logic 0), a NOT signal of the NAND signal (the output of the NAND circuit 422) is output as the
intermediate carry-out signal. In the illustrated example, the EOR circuits 440, 450, and 460, excluding the EOR circuit
430 constructed from a drive gate circuit, are each constructed with inverters and transfer gates of 8 to 10 transistors,
and the entire circuit is constructed using 50 elements (50 transistors).

Figure 17 is a circuit diagram showing another embodiment of a 4-2 compression circuit constituting a Wallace tree
circuit for a multiplier circuit. The 4-2 compression circuit shown in Figure 17 achieves a further reduction in the number
of elements compared with the 4-2 compression circuit shown in Figure 16.

More specifically, in the 4-2 compression circuit of Figure 17, of the three stages of EOR circuits 440, 450, and 460
in the 4-2 compression circuit of Figure 16, the EOR circuit (450) located in the intermediate position where the effects
of the input and output are least felt is replaced by a 6-transistor EOR circuit 450' which is the same one as that shown
in Figure 7, thereby achieving a further reduction in the number of elements while retaining the high speed capability of
the circuit.

As shown in Figure 17, the circuit (EOR circuit 450') that exclusive-ORs the exclusive OR (EOR signal) of the first
input signal x1 and second input signal x2 with the exclusive OR (EOR signal) of the third input signal x3 and fourth
input signal x4 is constructed from the 6-transistor EOR circuit. That is, while the EOR circuit 450 in Figure 16 is
constructed with the transfer gates 405 and 406 and the inverters 412, 415, and 416, using a total of 10 transistors, the
EOR circuit 450' in Figure 17 is constructed using six transistors, i.e., transistors 424 and 425, a transfer gate 426, and
an inverter 412. As a result, the 4-2 compression circuit shown in Figure 17 can be constructed using 48 elements (48
transistors).

Figure 18 is a circuit diagram showing still another embodiment of a 4-2 compression circuit constituting a Wallace
tree circuit for a multiplier circuit. In this embodiment also, the 4-2 compression circuit is constructed using 48 elements.
As shown in Figure 18, a NOR signal of the first input signal x1 and second input signal x2 and an exclusive-OR
(EOR) signal of x1 and x2 are generated, and when the EOR output (the output of the NOR circuit 429) is true (logic 1),
the third input signal x3 is output as the intermediate carry-out signal (Cout), while when the EOR output is false (logic
0), a NOT signal of the NOR signal (the output of the NOR circuit 428) is output as the intermediate carry-out signal.

In the 4-2 compression circuits shown in Figures 16 to 18, the number of necessary elements is reduced to 50 or
fewer, and yet the circuit is unaffected by the delay through the transfer gates explained in the description of the prior art;
that is, the reduction is achieved without sacrificing the processing speed.

A third embodiment of the invention hereinafter described is concerned with a partial product bit compression
method for reducing the number of partial product bits for each digit place, that can shorten the critical path in partial bit
compression processing without increasing the number of necessary elements compared with the prior art.

First, features of the partial product bit compression method will be briefly described. As a first feature, a multiplier
circuit utilizing a Booth algorithm contains a circuit which, instead of a partial product bit signal in accordance with a
Booth algorithm, directly generates for each digit place a bit signal corresponding to the sum of a correction value for
2's complement of the most significant partial product and a binary number represented by a bit in the sign digit of the
least significant partial product and bits from the least significant digit of the most significant partial product to the digit
one place lower than the sign digit of the least significant partial product. As a second feature, a multiplier circuit utilizing
a method for avoiding sign extension by correction processing contains a circuit which performs the addition of a 1 for
sign correction in a digit place one position higher than the sign digit of each partial product, wherein an intermediate
carry-out signal as a summation output for a digit place one position lower than the digit containing the correction term,
or a carry signal itself, is added in a digit place two positions higher, and an inverted signal thereof is added in a digit
place one position higher, thereby apparently eliminating the addition of the 1 in the summation of bits for each place.

With the first feature of the partial product bit compression method it becomes possible to reduce the maximum
number of partial product bits without having to provide an additional circuit for 2's complement generation as required
in the prior art. Further, with the second feature of the partial product bit compression method, by performing sign digit
processing as part of the processing in the partial product bit compression circuit it becomes possible to reduce the
number of elements in the processing circuit as a whole. Thus, a multiplier circuit faster than the prior art can be
constructed without having to provide the additional circuitry that contributed to increasing the critical path in the Wallace
tree circuit in the prior art as previously described. In this way, with the first and second features of the partial product
bit compression method, a multiplier circuit can be constructed which is capable of performing
operations at higher speed than the prior art, while reducing the number of elements compared with the prior art.

Figure 19 is a diagram for explaining one example of the partial product bit compression method employable in a
multiplier circuit.

In Figure 19, the above-described method (the first feature of the partial product bit compression method) is applied
for multiplication of 8 x 8 bits utilizing a second-order Booth algorithm, as an example. By applying the third embodiment
of the invention (the partial product bit compression method for the multiplier circuit), the number of partial products to
be processed, usually five partial products, can be reduced to four. This, for example, allows the direct use of the 4-2
compression circuit, and leads to a reduction in circuit size. Of the partial product bits, the following five bits are affected
by the application of the invention.

Most significant partial product: PM0,6 and PM1,6
Least significant partial product: PS10, PS20, and PS30

The correspondence with the original values is given by the following relation.

     PS0    /PS0   /PS0   P1,6    P0,6
+)                                M6
     ---------------------------------
     PS30   PS20   PS10   PM1,6   PM0,6

Therefore, logic of the above five bits should be so determined as to satisfy the above relation. More specifically, the
following logic expressions are determined.

PM0,6 = P0,6 ⊕ M6 = a0 · X6 · (M6 + PL6)
PM1,6 = P1,6 ⊕ (P0,6 · M6)
      = X6 · (M6 · (a0 ⊕ a1) + PL6 · a1) + /X6 · a0 · (M6 + PL6)
PC6 = P1,6 · P0,6 · M6
    = M6 · /a0 · (/a1 · X6 + /X6)
E1,6 = PL6 · a1 + M6 · /a1
PS10 = PC6 ⊕ /PS0 = PC6 ⊕ (M0 · /as + PL0 · as)
PS20 = /PS0 ⊕ (PC6 · /PS0) = /PC6 · /PS0
PS30 = PS0 ⊕ (PC6 · /PS0) = PC6 + PS0

Here, PC6 is a carry signal from the digit of PM1,6 to the higher-order digit, and E1,6 is a signal necessary to obtain
P2,6. The least significant partial product is also included because, as can be seen from the above expressions, changing
the sign digit portion of the partial product bits makes it easier to simplify the logic and enhances the effect of the
reduction in the number of elements. Examples of logic circuit configurations corresponding to the above seven logic
expressions are shown in Figures 20 and 21.
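The relation and the logic expressions above can be cross-checked numerically. The Python sketch below is an illustration only, with hypothetical function names: it treats PS0, P1,6, P0,6, and M6 as free input bits, forms the five compressed bits from the expressions for PM0,6, PM1,6, PC6, PS10, PS20, and PS30, and compares the result with the five-bit number (PS0, /PS0, /PS0, P1,6, P0,6) plus M6. Reading the relation as an addition of M6 in the lowest of the five digit places is an interpretation of the text, stated here as an assumption.

```python
from itertools import product


def compressed_bits(ps0, p1, p0, m6):
    """Five output bits (PS30, PS20, PS10, PM1,6, PM0,6) from the logic expressions."""
    not_ps0 = ps0 ^ 1
    pm0 = p0 ^ m6                 # PM0,6 = P0,6 xor M6
    pm1 = p1 ^ (p0 & m6)          # PM1,6 = P1,6 xor (P0,6 and M6)
    pc6 = p1 & p0 & m6            # PC6: carry out of the PM1,6 digit
    ps10 = pc6 ^ not_ps0          # PS10 = PC6 xor /PS0
    ps20 = (pc6 ^ 1) & not_ps0    # PS20 = /PC6 and /PS0
    ps30 = pc6 | ps0              # PS30 = PC6 or PS0
    return ps30, ps20, ps10, pm1, pm0


def as_number(bits):
    """Interpret a tuple of bits (most significant first) as an unsigned integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def relation_holds():
    """True if the compressed bits equal the original bits plus M6 in every case."""
    for ps0, p1, p0, m6 in product((0, 1), repeat=4):
        not_ps0 = ps0 ^ 1
        original = as_number((ps0, not_ps0, not_ps0, p1, p0)) + m6
        if as_number(compressed_bits(ps0, p1, p0, m6)) != original:
            return False
    return True
```

In this model `relation_holds()` is true for all sixteen input combinations, and no carry is lost out of the five-bit field, which is consistent with the claim that the correction term can be absorbed without adding a partial product row.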

Figures 20 and 21 are circuit diagrams showing one embodiment for implementing the partial product bit
compression method of Figure 19.

As shown in Figures 20 and 21, the logic circuits corresponding to the above seven logic expressions together
comprise: five inverters 501 to 503, 511, and 512; three AND circuits 504, 513, and 514; six OR circuits 505, 506, 515, 516,
523, and 524; six NAND circuits 507, 508, 517, 518, 519, and 525; three NOR circuits 509, 520, and 521; and two EOR
circuits 510 and 522, the total number of elements being 88. In some examples of the prior art, circuitry corresponding
to the above circuitry is achieved with a smaller number of elements but, in preferred embodiments of the present
invention, the reduction in the number of elements in the summation circuitry for compressing partial product bits for each
digit place achieved by the reduction of bits for each digit place is far greater than the increase, which means that the
number of elements as a whole is reduced.

Figures 22A and 22B are diagrams for explaining the partial product bit compression method for a multiplier circuit
according to another embodiment of the present invention. Figure 22A is for explaining an example of a prior art method
of partial product bit compression, and Figure 22B is for explaining the partial product bit compression method according
to the present embodiment of the invention. More specifically, Figure 22B concerns the second feature of the partial
product bit compression method described above when it is applied to bit compression in the 10th and higher digit
places.

As shown in Figure 22A, the partial product bit compression method of the prior art requires two 4-2 compression
circuits (4W), a 1-bit full-adder circuit (3W), and a half-adder circuit (2W), requiring a total of 134 elements (134
transistors).

On the other hand, as shown in Figure 22B, when the partial product bit compression method of the present
embodiment of the invention is used, the necessary circuitry can be constructed using a 4-2 compression circuit (4W),
two 1-bit full-adder circuits (3W), and three inverters, i.e. a total of 102 elements (102 transistors).

As shown in Figure 22B, the intermediate carry-out signal d in the 10th digit place is input after inversion to 3W in
the 11th digit place, while the same signal d is input directly to 3W in the 12th digit place for processing. The partial
product bit signal P in the 13th digit place is inverted by an inverter and processed as the sum signal S, and the inverted
signal of the sum signal S is processed as the carry signal C to the 14th digit place. In this way, by applying the partial
product bit compression method of the present embodiment of the invention, compression of bits in each digit place can
be accomplished using fewer elements than the prior art without reducing the processing speed.
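The rerouting of the signal d can be understood through a one-line identity. The following is a sketch under an assumption about the figure, which is not reproduced here: the carry d and the sign-correction 1 are taken to enter the same digit place, whose weight is written w (a symbol introduced only for this illustration).

```latex
d \cdot w + 1 \cdot w \;=\; \bar{d} \cdot w + d \cdot 2w, \qquad d \in \{0, 1\}
```

For d = 0 both sides equal w, and for d = 1 both sides equal 2w. Adding the inverted signal in that digit place and the signal itself one place higher therefore accounts for the correction 1 without a separate addend, which is consistent with the inverted and direct inputs of d described above.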

Thus, the partial product bit compression method of the third embodiment of the invention can achieve a reduction
in the number of partial product bits for each digit place and shorten the critical path in partial product bit compression
processing without increasing the number of necessary elements compared with the prior art.

By combining the first, second, and third embodiments of the invention described above, the number of elements
can be reduced by as much as 30 to 40% compared with the prior art, and yet such a multiplier circuit can have a
processing speed comparable to that of a prior art multiplier circuit. Furthermore, if the savings in the number of ele- 
ments are used to construct additional circuits for implementing specific signal processing functions, an integrated cir- 
cuit with enhanced functionality can be realized using the same manufacturing technology. 

Moreover, as has been described in detail above, by applying the first, second, and third embodiments of the inven-
tion in a combination suited to the required specifications of the multiplier circuit desired, multiplier circuits can be 
reduced in size by reducing the number of elements necessary without sacrificing the high speed capability of the circuit 
concerned. 

Many different embodiments of the present invention may be constructed without departing from the spirit and
scope of the present invention, and it should be understood that the present invention is not limited to the specific
embodiments described in this specification.
Claims

1. An adder circuit which takes, for each digit place, four input signals (x1, x2, x3, x4) and one intermediate carry-in
signal (Cin), and generates one intermediate carry-out signal (Cout) along with a sum signal (Si, j) and a carry
signal (Ci, j) for output, wherein

an OR or NOR signal and an exclusive-OR signal of a first input signal (x1) and a second input signal (x2) from
the same digit place are formed, and when the exclusive-OR signal is a first value ("1"), a third input signal (x3)
from the same digit place is output as the intermediate carry-out signal (Cout), while when the exclusive-OR
signal is a second value ("0"), the OR or NOR signal is output as the intermediate carry-out signal.

2. An adder circuit which takes, for each digit place, four input signals (x1, x2, x3, x4) and one intermediate carry-in
signal (Cin), and generates one intermediate carry-out signal (Cout) along with a sum signal (Si, j) and a carry
signal (Ci, j) for output, wherein

an AND or NAND signal and an exclusive-OR signal of a first input signal (x1) and a second input signal (x2)
from the same digit place are formed, and when the exclusive-OR signal is a first value ("1"), a third input signal
(x3) from the same digit place is output as the intermediate carry-out signal (Cout), while when the exclusive-OR
signal is a second value ("0"), the AND or NAND signal is output as the intermediate carry-out signal
(Cout).

3. An adder circuit as claimed in claim 1 or 2, wherein a circuit for generating the sum signal (Si, j) by exclusive-ORing
the five input signals (x1, x2, x3, x4, Cin) comprises:

a first exclusive-OR circuit (450, 450'), constructed from a single-transfer-gate circuit, for exclusive-ORing the
exclusive-OR signal of the first and second input signals (x1, x2) with the exclusive-OR signal of the third and
fourth signals (x3, x4) from the same digit place; and

a plurality of second exclusive-OR circuits (440, 460), each constructed from a drive gate circuit or a
complementary-transfer-gate circuit, for exclusive-ORing the other signals.

4. An adder circuit as claimed in claim 3, wherein said first exclusive-OR circuit (450') comprises six transistors.

5. An adder circuit which takes, for each digit place, four input signals (x1, x2, x3, x4) and one intermediate carry-in
signal (Cin), and generates one intermediate carry-out signal (Cout) along with a sum signal (Si, j) and a carry
signal (Ci, j) for output, wherein a circuit for generating the sum signal (Si, j) by exclusive-ORing the five input signals
(x1, x2, x3, x4, Cin) comprises:

a first exclusive-OR circuit (450, 450'), constructed from a single-transfer-gate circuit, for exclusive-ORing the
exclusive-OR signal of the first and second input signals (x1, x2) with the exclusive-OR signal of the third and
fourth signals (x3, x4) from the same digit place; and

a plurality of second exclusive-OR circuits (440, 460), each constructed from a drive gate circuit or a
complementary-transfer-gate circuit, for exclusive-ORing the other signals.

6. An adder circuit as claimed in claim 5, wherein said first exclusive-OR circuit (450') comprises six transistors.

7. A digital multiplier circuit containing as part thereof an adder circuit as claimed in any one of the preceding claims.

8. A digital multiplier circuit comprising: 

an encoder (205; 12) for receiving a multiplier bit signal (bj) and for outputting a plurality of encode signals; and

a partial product bit generating circuit (14) for receiving the encode signals along with a multiplicand bit signal
(ai, /ai) from each digit place and for generating a partial product bit for each digit place, said partial product bit
generating circuit (14) including a first selection circuit (201, 203) for selecting a logically true signal from
among the encode signals in accordance with a value of the multiplicand bit signal.

9. A multiplier circuit as claimed in claim 8, wherein the multiplicand bit signal and its inverted signal (ai, /ai) are
supplied to said partial product bit generating circuit (14).

10. A multiplier circuit as claimed in claim 8, wherein said encoder (205; 12) is a Booth encoder.

11. A multiplier circuit as claimed in claim 8, wherein the encode signals (B1j, B2j; Mj, PLj) to be selected by said first
selection circuit (201, 203) are signals identifying whether a necessary signal as an encoded result is the
multiplicand bit signal itself or its inverted signal.

12. A multiplier circuit as claimed in claim 11, wherein said partial product bit generating circuit (14) is loaded with
multiplicand bit signals (ai, /ai; ai+1, /ai+1) from a plurality of digit places; and

said partial product bit generating circuit further includes a second selection circuit (202, 204) for selecting,
from among the plurality of signals selected based on each multiplicand bit signal, a signal that matches the
result of encoding in accordance with an encode signal (Xj, 2Xj) different from the selected signals.

13. A multiplier circuit as claimed in claim 12, wherein each of said first and second selection circuits (201, 203; 202,
204) is constructed from two AND circuits (301, 302) and one NOR circuit (303).

14. A multiplier circuit as claimed in claim 12, wherein each of said first and second selection circuits (201, 203; 202,
204) is constructed from transfer gates.

15. A multiplier circuit as claimed in claim 14, wherein each of said first and second selection circuits (201, 203; 202,
204) is constructed from two transfer gates.

16. A multiplier circuit as claimed in any one of claims 8 to 15, wherein multiplicand bit signal lines for transferring
complementary multiplicand bit signals corresponding to multiplicand digits are arranged in parallel to each other
extending in a first direction in a two-dimensional plane, and sets of encode signal lines corresponding to multiplier
digits are arranged extending in a second direction that intersects said first direction, while said partial product bit
generating circuit (14) is repeatedly arranged in order to contain a plurality of predetermined adjacent intersections
of said multiplicand bit signal lines and said encode signal lines.

17. A multiplier circuit utilizing a Booth algorithm, comprising a circuit which, instead of a partial product bit signal in
accordance with the Booth algorithm, generates for each digit place a bit signal corresponding to a sum of a
correction value for two's complement of a most significant partial product and a binary number represented by a bit in
a sign digit of a least significant partial product and bits from a least significant digit of said most significant partial
product to a digit one position lower than the sign digit of said least significant partial product.

18. A multiplier circuit utilizing a method for avoiding sign extension by correction processing, comprising a circuit which
performs addition of one for sign correction in a digit place one position higher than a sign digit of each partial
product, wherein an intermediate carry-out signal as a summation output for a digit place containing said sign digit, or
a carry signal itself, is added in a digit place two positions higher, and a NOT signal thereof is added in a digit place
one position higher.

19. A multiplier circuit utilizing a Booth algorithm and also utilizing a method for avoiding sign extension by correction
processing, comprising:

a circuit which, instead of a partial product bit signal in accordance with the Booth algorithm, generates for
each digit place a bit signal corresponding to a sum of a correction value for two's complement of a most
significant partial product and a binary number represented by a bit in a sign digit of a least significant partial
product and bits from a least significant digit of said most significant partial product to a digit one position lower than
the sign digit of said least significant partial product; and

a circuit which performs addition of one for sign correction in a digit place one position higher than a sign digit
of each partial product, wherein an intermediate carry-out signal as a summation output for a digit place
containing said sign digit, or a carry signal itself, is added in a digit place two positions higher, and a NOT signal
thereof is added in a digit place one position higher.

20. A large-scale semiconductor integrated circuit wherein a multiplier circuit, as claimed in any one of claims 7 to 19,
is integrated together with additional circuitry for implementing signal processing functions.

21. A partial product bit compression method for a multiplier circuit utilizing a Booth algorithm, wherein

instead of a partial product bit signal in accordance with the Booth algorithm, a bit signal corresponding to a
sum of a correction value for two's complement of a most significant partial product and a binary number
represented by a bit in a sign digit of a least significant partial product and bits from a least significant digit of said
most significant partial product to a digit one position lower than the sign digit of said least significant partial
product, is generated directly for each digit place.

22. A partial product bit compression method for a multiplier circuit utilizing a method for avoiding sign extension by
correction processing, wherein

addition of one for sign correction is performed in a digit place one position higher than a sign digit of each
partial product, and an intermediate carry-out signal as a summation output for a digit place containing said sign
digit, or a carry signal itself, is added in a digit place two positions higher, while a NOT signal thereof is added
in a digit place one position higher.
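
By way of illustration only, the two-stage selection recited in claims 8 to 12 can be modelled in Python as follows, using a radix-4 Booth encoder as in claim 10. The sketch assumes that PLj and Mj flag a positive and a negative Booth digit respectively, and that Xj and 2Xj flag a digit of magnitude one and two. These meanings are assumptions and are not confirmed by the claims themselves. The function names are hypothetical, and the "+ m" term stands in for the two's complement correction that the circuit would add separately.

```python
def bit(value, i):
    """Bit i of a two's-complement integer; bit -1 is defined as 0."""
    return (value >> i) & 1 if i >= 0 else 0


def booth_encode(b_hi, b_mid, b_lo):
    """Radix-4 Booth digit from three multiplier bits -> (x1, x2, pl, m)."""
    digit = -2 * b_hi + b_mid + b_lo
    x1 = int(abs(digit) == 1)  # assumed role of Xj: use the bit of the same place
    x2 = int(abs(digit) == 2)  # assumed role of 2Xj: use the bit one place lower
    pl = int(digit > 0)        # assumed role of PLj: digit is positive
    m = int(digit < 0)         # assumed role of Mj: digit is negative
    return x1, x2, pl, m


def first_select(a_bit, pl, m):
    """First selection: the multiplicand bit picks PL (bit = 1) or M (bit = 0)."""
    return pl if a_bit else m


def partial_product_bit(a_i, a_im1, x1, x2, pl, m):
    """Second selection: X keeps the result for a_i, 2X the result for a_(i-1)."""
    return (first_select(a_i, pl, m) & x1) | (first_select(a_im1, pl, m) & x2)


def booth_multiply(a, b, n=4):
    """Multiply signed n-bit integers (n even) by summing Booth partial products."""
    total = 0
    for j in range(n // 2):
        x1, x2, pl, m = booth_encode(bit(b, 2 * j + 1), bit(b, 2 * j), bit(b, 2 * j - 1))
        row = 0
        for i in range(n + 1):  # one extra place so that 2 * a fits
            row |= partial_product_bit(bit(a, i), bit(a, i - 1), x1, x2, pl, m) << i
        if row >> n:            # top bit is the sign of the (n + 1)-bit row
            row -= 1 << (n + 1)
        total += (row + m) << (2 * j)  # "+ m" completes the two's complement
    return total
```

With these assumed signal meanings, `booth_multiply(a, b)` agrees with `a * b` for signed 4-bit operands. The first selection needs only the multiplicand bit as its control input, which matches the wording "in accordance with a value of the multiplicand bit signal".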
